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Abstract: 

Studies of the anther transcriptome on non-model plants without a known 
genome are surprisingly scarce. RNA-Seq and digital gene expression (DGE) 
profiling provides a comprehensive approach to identify candidate genes contributing 
to developmental processes in non-model species. Here we built a transcriptome 
library of developing anthers of Hamelia patens and analysed DGE profiles from each 
stage to identify genes that regulate tapetum and pollen development. In total 7,720 
putative differentially expressed genes across four anther stages were identified. The 
number of putative stage-specific genes was: 776 at microspore mother cell stage 
(MMC), 807 at tetrad stage (TET), 322 at uninucleate microspore stage (UNM) and 
the highest number (1,864) at bicellular pollen stage (BCP). GO enrichment analysis 
revealed 243 differentially expressed and 108 stage-specific genes that are potentially 
related to tapetum development, sporopollenin synthesis and pollen wall. The number 
of expressed genes, their function and expression profiles were all significantly 
correlated with anther developmental processes. Overall comparisons of anther and 
pollen transcriptomes with those of rice and Arabidopsis together with the expression 
profiles of homologs of known anther-expressed genes, revealed conserved patterns 
and also divergence. The divergence may reflect taxon-specific differences in gene 
expression, the use RNA-seq as a more sensitive methodology, variation in tissue 
composition and sampling strategies. Given the lack of genomic sequence, this study 
succeeded in assigning putative identity to a significant proportion of 
anther-expressed genes and genes relevant to tapetum and pollen development in H. 
patens. The anther transcriptome revealed a molecular distinction between 
developmental stages, serving as a resource to unravel the functions of genes involved 
in anther development in H. patens and informing the analysis of other members of 
the Rubiaceae. 
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Introduction 

Pollen grains are the microgametophytes of seed plants that produce the male 
gametes required for sexual reproduction (Borg and Twell, 2011). Although the 
overall structure of pollen is conserved in angiosperms, pollen shows considerable 
variation in size, shape and surface characteristics (Heslop-Harrison, 1968; 
Blackmore et al., 2007). Recently, the diversity and evolution of palynological 
characters were documented across the angiosperms based on the most 
comprehensive classification APG III (2009). Many characters, such as exine 
structure and aperture features, have indicated key evolutionary transitions in pollen 
morphology and have proven to be informative at different taxonomic levels (Wortley 
et al., 2015; Lu et al., 2015). However, little is known about the genetic basis 
underlying pollen evolution. Comparative analysis has indicated significant 
conservation in anther gene expression patterns between Arabidopsis and maize (Ma 
et al., 2008). Further comparative analysis of anther and pollen transcriptome profiles 
for diverse angiosperms would complement understanding the conservation of the 
molecular mechanisms underlying pollen development (Rutley and Twell, 2015; 
Wilson and Zhang, 2009). 


Pollen development involves the coordination of cellular activities and the 
underlying gene expression in sporophytic and gametophytic cells of the anther 
(Ariizumi and Toriyama, 2011). Anthers are composed of developing gametophytes 
and a number of surrounding cell layers: the innermost tapetum, the middle layer, the 
endothecium and the outer epidermis. Initially, archesporial cells form in the L2 layer 
of the anther primordium and divide periclinally to form outer primary parietal cells 
and inner primary sporogenous cells. The primary sporogenous cells undergo 
divisions to form microspore mother cells (MMCs), whereas the primary parietal cells 
(PPCs) undergo a series of divisions to form the cell layers of the anther wall (Scott et 
al., 2004). The MMCs undergo meiosis producing tetrads of haploid microspores at 
tetrad stage (TET). The tetrads are usually separated as free microspores after the 
callosic walls are dissolved by callase at the uninucleate microspore stage (UNM). 
The free microspores divide asymmetrically to segregate the male germline and 
develop further into pollen grains before release at bicellular pollen stage (BCP) or 
tricellular pollen stage (TCP) (McCormick, 1993; Twell, 2011). Pollen development 
relies on the complex interactions between reproductive and non-reproductive tissues 
of the anther, especially the tapetum, which provides enzymes for the dissolution of 
tetrads, nutrients to the developing microspores and materials deposited onto the 
pollen exine (Goldberg et al., 1993; Tsuchiya et al., 1994; Li et al., 2006). 


A significant number of key genes required for anther and pollen development 
have been identified by forward genetic screens in model and crop plants. These 
genes, which include a significant number of transcriptional regulators, control 
formation and degeneration of the tapetum, microspore release from the tetrad and 
formation of the complex pollen wall (Boavida et al., 2005; Wilson and Zhang, 2009; 
Shi et al., 2015). Key genes involved in tapetum formation and differentiation include, 
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EXCESS MICROSPOROCYTESI / EXTRA SPOROGENOUS CELLS (EMSI/EXS) 
(Zhao et al., 2002), and TAPETUM DETERMINANT 1(TPD1) (Yang et al., 2003). 
Genes involved in the programmed cell death (PCD) of tapetum cells include TAZ/ 
(Kapoor et al., 2002), TAPETUM DEGENERATION RETARDATION (TDR) (Li et al., 
2006), PERSISTENT TAPETAL CELL 1 (РТСІ), APIS (Li et al., 2011), Osc6 (Zhang 
et al., 2010) and СЕРІ (Zhang D. D. et al., 2014). The absence of these genes results 
in abnormal tapetal PCD and sterile pollen. 

Transcriptional regulation is a major mechanism controlling anther development 
in Arabidopsis. The bHLH transcription factor DYSFUNCTIONAL TAPETUM (DYTI) 
plays a critical role in regulating tapetum function and pollen development (Zhang et 
al., 2006; Feng et al., 2012; Zhu et al., 2015). DYTI regulates the expression of the 
ABORTED MICROSPORES (AMS) (Sorensen et al., 2003; Xu et al., 2014), MALE 
STERILITY] (MSI) (Vizcay-Barrena and Wilson, 2006), and other tapetum 
preferential genes, primarily via DEFECTIVE IN TAPETAL DEVELOPMENT AND 
FUNCTION 1 (TDFI) (Gu et al., 2014). Important regulators of tetrad callose 
degradation include the Arabidopsis transcription factor gene А/МҮВ103 (Higginson 
et al., 2003; Zhang et al., 2007), and the rice В-1-3, glucanase encoded by Osg/ (Wan 
et al., 2011). Key genes required for pollen exine development have also been 
characterized, including MS2 (Aarts et al., 1997), ACOS5 (Souza et al., 2009), 
CYP703A2 (Morant et al., 2007), CYP703A3 (Aya et al., 2009) and EFD (Hu et al., 
2014). Moreover, recent co-expression analysis revealed that AMS acts as a master 
regulator coordinating pollen wall development and sporopollenin biosynthesis in 
Arabidopsis (Xu et al., 2014). Thus, current understanding of anther regulatory genes 
is overwhelmingly derived from studies of model plants and crops and little 
information about gene activities in developing anthers is known from non-model 
species. 


Transcriptome analyses have significantly enriched knowledge of the repertoire of 
genes expressed during pollen development. Genome-scale analyses have been 
described for developing anthers and pollen, germinating pollen and pollen tubes, and 
mature pollen transcriptome data for at least 10 different species (Huang et al., 2011; 
Rutley and Twell, 2015). Transcriptome profiling at different developmental stages 
have been reported for Arabidopsis (Honys and Twell, 2004), rice (Wei et al., 2010), 
and tobacco (Bokvaj et al., 2015). Transcriptome studies of developing Arabidopsis 
anthers and pollen have revealed genes that function as early as meiosis, in the 
tapetum and in exine formation (Dukowic-Schulze and Chen, 2014). Recently, 
genome-wide co-expression analysis revealed 98 candidate genes closely associated 
with pollen wall development (Xu et al., 2014). Moreover, comparative studies in rice 
involving microarray studies of developing anthers have also revealed complex 
transcriptomes (Deveshwar et al., 2011). 


RNA-Seq based methods have led to a dramatic acceleration in gene discovery 
(Barrero et al., 2011; Garg et al., 2011; Shi et al., 2011), which has rapidly broadened 
understanding of the complexity of gene networks and their regulation (Wang et al., 
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2010; Hua et al., 2011). Moreover, there are limitations of microarray technology such 
as lack of quantitative detection of transcripts with different probe sets, low 
abundance transcripts are difficult to detect, and a reference genome and/or expressed 
sequence tags (ESTs) are required (Huang et al., 2011; Zhao et al., 2014). Studies of 
anther transcriptomes based on RNA-seq are limited to those in a few crops and 
model plants, however, expression profiling of developing anthers by Next 
Generation Sequencing (NGS) from species whose genomes have not been sequenced 
is anticipated (Wang et al., 2009; Huang et al., 2011). 

Rubiaceae, commonly known as the coffee family, is the fourth largest in the 
angiosperms and has important economic utilities; in addition to the valuable 
beverage crop coffee, as a source of quinine, dyes and ornamentals. It is a 
eurypalynous family with great diversity in pollen size and shape, aperture type, 
number and position, pollen wall stratification and sexine ornamentation (Dessein et 
al, 2005). Some pollen features have significant value in stimulating further 
palynological investigations, for example, the development of Ubisch bodies provides 
a potential model for studying sporopollenin deposition (Dessein et al., 2005). 
However, there is limited information about the gene expression patterns underlying 
anther and/or pollen development in Rubiaceae. Current molecular data from model 
species and a few crop plants are insufficient to resolve the mechanisms that regulate 
the development, function and evolution of angiosperm pollen. More attention to such 
a palynologically well-documented family would help clarify the molecular processes 
underlying anther/pollen development in angiosperms and the associated evo-devo 
mechanisms. 


In this paper, Hamelia patens Jacq. was chosen for study as a member of the 
coffee family. H. patens originates in tropical America and is a popular ornamental 
plant in south and southwest China. The ease of cultivation and long blooming period 
from May to October enables facile collection of floral material. Further, this species 
produces triaperturate pollen grains, representing the plesiomorphic condition in the 
Rubiaceae family. These advantages make H. patens a potentially valuable system for 
developmental and molecular research in Rubiaceae. 


Transcriptome profiles of Hamelia patens developing anthers from microspore 
mother cell to bicellular pollen grain stages were investigated using Next Generation 
Sequencing (NGS) and NGS-based digital gene expression (DGE) tag profiling. The 
aims of the present study were, (1) to provide molecular insight into anther and pollen 
development of this phylogenetically and palynologically important family, and (2), to 
reveal novel candidate genes involved in tapetum and pollen development, to shed 
light on the evo-devo mechanisms of angiosperm pollen development. 


1. Materials and Methods 
1.1 Sample preparation for anther development 

Flowers of Hamelia patens Jacq. at various developmental stages were harvested 
from plants growing in South China Botanical Garden (SCBG), Guangdong Province, 
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China. Each flower contains five anthers. The length of flowers and the anthers was 
measured using a stereo microscope. The classification of flowers and anthers for 
categorization of anther developmental stages is summarized in Table 1. 

The anthers were fixed in 2.5 % glutaraldehyde in 0.1 mol/L phosphate buffer, 
pH 7.2, placed under a vacuum for 2 h, then stored at 4°C for several days. After 
removal from storage, the anthers were rinsed in 0.1 mol/L phosphate buffer for 2 h, 
and postfixed in 1 96 osmium tetroxide overnight. Following post-fixation, anthers 
were washed in phosphate buffer, dehydrated in an acetone series, embedded in 
PON-812 resin and cured at 70 °С. Semi-thin sections (1 - 2 um) were cut with glass 
knives on a LKB-11800 microtome, stained with 0.1% Toluidine Blue, and observed 
and photographed with a Leica - DM5500B light microscope (LeiCa, Germany). 
Ultrathin sections (80 nm) were cut using a Leica-Ultracut S ultramicrotome with a 
diamond knife, and stained with uranyl acetate and lead citrate. Transmission electron 
micrographs were taken with a JEOL JEM-1010 (JEOL, Japan) transmission electron 
microscope at 100 KV. 


For SEM, the pollen grains were mounted on copper stubs with a strip of 
double-sided conductive tape and air-dried. The sample was then coated with gold in 
a JEOL JFC-1600 sputter coater. Observations and digital images were collected with 
a JEOL JSM-6360LV SEM. Measurements of the polar axis (P) and equatorial 
diameter (E) were made on digital SEM images with JEOL’s Smile View software. 


1.2 Sample preparation for RNA extraction 

Transcriptome profiling was carried out on developing anthers at four landmark 
stages: microspore mother cell (MMC) stage, tetrad (TET) stage, uninucleate 
microspore (UNM) stage and bicellular pollen (BCP) stage. The four stages were 
defined on the basis of cytological observations with LM and TEM as described 
above and confirmed using DAPI staining. 


Fresh flowers were collected from three well-managed populations of H. patens 
in the SCBG. The anthers were picked out under a dissecting microscope, 
immediately frozen in liquid nitrogen and stored at -80°С until ВМА extraction. They 
were classified into four separate groups according to length and each group was 
pooled by mixing equal quantities of anthers from the three populations. 


Total RNA was extracted using pBIOZOL method. RNA quality was 
characterized by using a NanoDrop ND1000 spectrophotometer (NanoDrop, USA), 
and by determination of the RIN (RNA Integrity Number) value ( 7.3) using an 
Agilent 2100 Bioanalyzer. 


1.3 Library preparation and sequencing 

The cDNA libraries were prepared following manufacturer's instructions 
(lumina, USA). Five cDNA libraries (MMC, TET, UNM, BCP and MSA) were 
established. MSA was a mixed stage anther sample containing equal amounts of 
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anthers from each of the four stages (MMC, TET, UNM and BCP). mRNA was 
enriched using oligo (dT) magnetic beads from 5 ug total RNA (total RNA amount of 
MSA was 20 ug). To avoid priming bias during cDNA synthesis, isolated mRNAs 
were first fragmented into short pieces (about 200 bp) using RNA Fragmentation 
Reagents (Ambion, USA). The cleaved mRNA fragments were converted to 
double-stranded cDNA using random hexamer primers (Illumina) with the 
SuperScript Double-Stranded cDNA Synthesis kit (Invitrogen, USA). The 
double-stranded cDNA was purified using a QiaQuick PCR Purification kit (Qiagen, 
USA) and was processed by end-repair using End Repair Mix Reaction System 
(Beijing Genomic Institute, China) and the addition of a single adenine. The repaired 
cDNA fragments were then ligated with sequencing adapters. To select a size range of 
templates for downstream enrichment, the products of the ligation reaction were 
purified on a 2% TAE-agarose gel. Then the purified cDNA was enriched by PCR 
amplification using Primer PE 1.0 and PE 2.0 (Illumina) complementary to the ends 
of the adapters with Phusion DNA Polymerase. The library products were then 
sequenced using Illumina HiSeq'™ 2000 at the Beijing Genomic Institute, China. 


1.4 Data processing and de novo assembly 

The original data produced by the sequencer were defined as raw reads. Base 
calling accuracy was measured by the Phred quality score (Q score), which is the 
most common metric used to assess the accuracy of a sequencing platform. Q20 
stands for 99 % accuracy representing probability of incorrect base call being 1 in 100, 
while Q30 stands for 99.9 % accuracy representing probability of incorrect base call 
being 1 in 1000 (Illunima technique notes: sequencing). Filtering of raw reads was 
carried out to obtain clean reads by remove those, a, containing adaptors, b, 
containing more than 5 % unknown nucleotides, and c, showing more than 10 % of 
bases with a Phred scaled probability (Q) less than 20. 

Clean reads were assembled into unigenes using the short reads assembly 
program Trinity, release-20130225  (http://trinityrnaseq.sourceforge.net/. Тһе 
resulting unigenes were divided into two classes: clusters containing several unigenes 
that had a similarity of more than 70 %, and a singletons class (Grabherr et al., 2011; 
Iseli et al., 1999). 


1.5 Functional annotation and analysis 

To assign putative gene function, unigene sequences were firstly aligned by 
ВГАЗТх (E-value < 1е-5) to protein databases (NR, SwissProt, KEGG, COG). Then 
the unigenes were searched against nucleotide database NT (E-value < le-5) using 
BLASTx to retrieve proteins with the highest sequence similarity with the given 
unigenes and their protein functional annotations. Functional categories of the 
predicted genes were obtained by applying gene ontology (GO) terms to the NR 
database annotation using the Blast2GO program. GO functional classification for all 
unigenes and the distribution of gene functions were analyzed using WEGO software 
(Conesa et al., 2005; Ye et al., 2006). To identify possible functions, the annotated 
unigene sequences were searched in the COG database where orthologous gene 
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products are classified (Kanehisa et al., 2008). To identify active pathways in H. 
patens anthers, annotated sequences were mapped to reference pathways in the KEGG 
database (Release 63.0). COG and KEGG pathway annotations were performed using 
BLASTall software. 


1.6 Comparison with the coffee genome 

Since Coffea canephora is the only species in the Rubiaceae with a sequenced 
genome, comparison analysis was conducted between the unigenes of H. patens and 
the C. canephora CDSs using BLASTn. Sequence similarity was estimated by the 
coverage of the matched length, by calculating the value of matched sequence length 
versus unigene length and CDS length respectively. The coffee genome was 
downloaded from the following website (http://coffee-genome.org/download). 
Moreover, the expression level and the GO annotations were examined on a set of 
genes with relatively high sequence similarity to their orthologs in coffee. 


1.7 Digital Gene Expression Tag Profiling (DGE) analysis 

In order to predict putative genes correlating with tapetum and pollen 
development, four separate DGE libraries (MMC, TET, UNM and BCP) were 
established as mentioned above and analyzed. Procedures of library construction and 
sequencing were the same with the MSA library. АП clean reads from the four 
libraries were mapped onto the transcriptome library (MSA) to calculate the 
expression level for each gene and the gene coverage using SOAP (version 2.21). If 
there was more than one transcript for a gene, the longest one was used to calculate its 
expression level and coverage. The gene expression level was calculated using RPKM 
(Reads Per kb per Million reads) value. The gene coverage was determined as the 
ratio of the base number in a gene covered by unique mapping reads to the total bases 
number of the gene. RPKM value — 10 was used as a criterion to define high-level 
expressed genes. RPKM value of 10 is the average of the median for every gene plus 
5. Highly expressed genes were categorized with a Venn diagram. 


DGE was used to compare the differences in gene expression (Audic and 
Claverie, 1997). The threshold FDR < 0.001 and the absolute value of the log»Ratio 
= 1 were used to determine the potential difference in gene expression. Genes 
enriched at a specific stage were identified by comparing gene expression between 
two adjacent stages. Genes that were two-fold up-regulated and expressed at 
high-level (RPKM = 10) at the same stage were identified. Differentially expressed 
genes (DEGs) were two-fold up-regulated at only one stage or simultaneously at two 
stages or three stages, which were identified by pairwise comparison of the four 
stages of anther development. Among the two fold up-regulated genes at only one 
stage, those with high expression (RPKM value — 10) at the same stage and 
meanwhile with low expression (RPKM value X 10) at the other three stages were 
defined as specifically expressed genes (SEGs). Then we carried out GO functional 
enrichment and KEGG pathway enrichment analysis for these SEGs. Cluster analysis 


8 


of gene expression patterns was performed with Cluster (De Hoon et al., 2004) and 
Java Treeview (Saldanha, 2004) softwares. With GO enrichment analysis, first all 
DEGs were mapped to GO terms in the database (http://www.geneontology.org/), then 
the gene numbers for every term was calculated, hypergeometric test was used to find 
significantly enriched GO terms in DEGs. The calculated p-value went through 
Bonferroni Correction, taking corrected p-value < 0.05 as a threshold. 


With Nr annotation, Blast2GO program was used to get GO annotation of DEGs, 
and then WEGO software (Ye et al. 2006) was used to do GO functional classification 
for DEGs. KEGG pathway enrichment analysis was used to identify significantly 
enriched metabolic pathways or signal transduction pathways in DEGs (Kanehisa et 
al., 2008). 


1.8 Real-time quantitative PCR (q-PCR) 

qRT-PCR was performed on independently collected samples to verify the 
potentially differentially expressed genes (DEGs) identified by RNA-seq. Total RNA 
was prepared from H. patens anthers at the four developmental stages as the same for 
digital gene expression analysis. First-strand cDNA was synthesized using a 
FastQuant RT Kit (TIANGEN, China) according to the manufacturer's protocol. 
Real-time PCR primer was designed on Web Primer: DNA and Purpose Entry website 
(http://www.yeastgenome.org/cgi-bin/web-primer). Primers used in the experiment 
are listed in Supplementary Table S1. АП reactions were performed using GoTaq 
qPCR Master Mix (Promega, USA). Reactions were carried out in a total volume of 
10 uL reaction mixture containing 5.0 uL of GoTaq qPCR Master Mix (Promega), 0.2 
uL (10 umol/L) of each primer, 200 ng of template cDNA. The real time RT-PCR 
amplification was performed with LightCycler 480 | Real-Time PCR System using 
two-step cycling conditions of 95 °С for 10 min, followed by 40 cycles of 95 °С for 
15s and 60 °С for 60s. Dissociation stage condition was set at 95 °С for 15s, 60 °С for 
15s and 95 °C for 15s. The B-Actin primer was designed based on the homologous 
gene of Rubia cordifolia L. B-Actin was amplified from Н. patens anthers at the same 
four stages and used as an internal control. The relative quantities of transcripts were 
calculated using the comparative Ct method and three biological replicates were 
performed. 


2. Results 
2.1 Ultrastructural observations of anther wall and pollen development 

Anthers of H. patens contain four microsporangia. In cross sections, the 
microspore mother cells (ММС$) are angular in shape and possess a large nucleus 
with a darkly staining nucleolus (Fig. 1A). The anther wall at the MMC stage is 
differentiated into epidermis, endothecium, middle layer and tapetum. The tapetum is 
uninucleate, one or two layered and the microspore mother cells become oval prior to 
meiosis (Fig. 1A). 


After meiosis, tetrads of haploid microspores fill the anther locules (Fig. 1B). 
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Tetrads are surrounded by a thick asymmetric callose envelope and the primary cell 
wall is visible outside of the callose. The plasmalemma of the microspores is initially 
straight and in direct contact with the callose (Fig. 2A). Subsequently, a fibrillar 
surface coat develops between the callose and the plasmalemma, which has a loose, 
irregular fibrillar texture and is considered primexine matrix (Fig. 2B). Rod-shaped 
electron-dense units are radially oriented in the distal part of the primexine, which 
eventually form the columellae. At this stage, the epidermis, endothecium and middle 
layer become high vacuolated, nuclei are displaced to the wall and nucleoli are 
fragmented. The middle layer becomes flat (Fig. 1B). Tapetal cells contain numerous 
ribosomes and extensive layers of endoplasmic reticulum that lie below the cell 
membrane and surround the nuclei in concentric rings. Clusters of small vesicles are 
present throughout the tapetal cytoplasm and pre-Ubisch bodies are formed (Fig. 2C). 


At uninucleate stage, microspores are released from the tetrads into the anther 
locule, and increase both in volume and in wall thickness (Fig. 1C). The epidermis 
and endothecium are high-vacuolated and contain lipidic materials and starch grains, 
whereas the middle layer is almost degenerated. Abundant Ubisch bodies are 
produced in the tapetal cells and extruded through their inner tangential and radial 
walls (Fig. 2D). The Ubisch bodies possess an irregular and granular shape. The 
electron density of the Ubisch body wall is very similar to that of the exine. During 
the early development of free microspores, the columellae increase in thickness, 
particularly at their distal ends, the tectum develops, while the sporopollenin-like 
materials are deposited between the distal portions of the columellae. The foot layer 
occurs at the bases of the columellae (Fig. 2E). White-line-centered lamellae were 
observed between the foot layer and the endexine (Fig. 2F). The endexine continues 
to develop on the proximal surface of these white lines and abundant electron-dense 
material, probably sporopollenin, appears beneath the endexine. Subsequently, the 
columellae become solid and the intercolumellar space is filled with fibrillar materials, 
i.e., remnants of the primexine matrix. As the microspores form a large vacuole (Fig. 
2G), the plasmalemma in the interapertural regions recedes, creating a large 
pericytoplasmic space under the radially oriented membranous granular material layer 
(Fig. 2H). This event might indicate the start of intine development. At this phase, the 
tapetal cell walls and cytomembranes are completely degenerated, a large number of 
vesicles and lipids are present throughout the tapetal cells and Ubisch bodies are 
released from the surface of the tapetal cells into the locule (Fig. 21). At this phase, a 
thin layer of cuticle is present at the anther epidermis (Fig. 21). 


Initiation of intine development takes place prior to the first pollen mitosis. At 
this stage the intine has a fibrillar structure and microspores possess abundant 
organelles e.g., rough endoplasmic reticulum (RER) and a large vacuole (Fig. 3A). 
The cytoplasm contains long profiles of rough endoplasmic reticulum parallel with 
the plasmalemma and numerous mitochondria. Later the microspores enter bicellular 
pollen stage in which the generative cell is surrounded by lipid droplets (Fig. 3B). The 
cytoplasm is characterized by an abundance of compound starch grains and lipid 
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droplets (Fig. 3B). The intine thickens and appears more electron-dense and reaches 
the endexine at several sites (Fig. 3C). Initiation of the characteristic fibrous 
thickenings takes place in the endothecium (Fig. 1D). Cells of the tapetum and middle 
layer completely degenerate, leaving the endothecium as the innermost anther wall 
layer (Fig. 3D). The volume of the pollen grains progressively increases, resulting in 
stretching of the pollen wall, which is more pronounced in pollen grains at dehiscence. 
Projection of the intine in the aperture region was observed in some pollen grains (Fig. 
3D, F). Pollen grains are bicellular at anthesis, relatively small [P (polar axis) 18.36 
(17.8-19.4) um x E (equatorial diameter) 19.66 (18.7-20.9) um], and oblate spheroidal 
in equatorial view (Fig. 3E). The grains are planaperturate with three compound 
apertures. The exine pattern is micro-reticulate with the sculpting somewhat obscured 
by the presence of fibrillar material that fills the spaces between the columellae (Fig. 
3F). 


2.2 cDNA sequence generation and de novo assembly 

Approximately 948 million bases were generated and a set of 105 million 90 bp 
paired-end reads were obtained after cleaning and quality checks. Q20 and Q30 
values reached 97.8 % and 92.1 96 respectively, representing high quality sequencing. 
Assembly of reads resulted in 177,548 contigs with an average length of 354 nt. These 
were assembled into 89,849 unigenes, which were divided into two classes. The first 
class is distinct singletons (52,379) with the prefix unigene, representing unigenes 
derived from a single gene. The second class is distinct cluster unigenes with the 
prefix CL and the cluster ID, where each cluster contains unigenes with similarities of 
more than 70 %. The 37,470 cluster unigenes are distributed into 14,948 distinct 
clusters, among which, 3,263 contain one unigene (unigenes shorter than 200 bp were 
removed after family clustering), and 7,247 contain two unigenes. The remaining 
4,438 clusters contain three or more unigenes, and the largest two clusters contain 45 
unigenes. The cluster unigenes may be derived from the same gene or from a 
homologous gene (Grabherr et al., 2011). 


2.3 Gene annotation and functional classification 

In the whole unigene set, a total of 57,476 (64 95) unigenes were significantly 
matched to known genes in the public databases of NR, NT, Swiss-Prot, KEGG, COG 
and GO (Table 2), representing putative functional annotations for more than half of 
the assembled sequences. 


Homology searches performed using BLASTx against the non-redundant (NR) 
protein database showed 55,563 (61.8 %) of unigenes with matches to the NR 
database with an E-value cut-off of 1e-5 (Supplementary Fig. S1A). The similarity 
distribution shows 68.6 % of BLASTx hits are within the range 60 to 100 %, 
representing strong homology. Only 31.4 % have similarity values less than 60 %, 
showing moderate homology (Supplementary Fig. S1B). With regard to species 
similarity, the highest proportion of matched sequences in the NR database are 
derived from Solanum lycopersicon (33.9 96, 18,815 unigenes) and Vitis vinifera 
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(26.4 %, 14,667 unigenes) which belong to the Solanales, a sister order to the 
Gentianales, to which Rubiaceae belongs (APG III, 2009; Supplementary Fig. S1C). 
H. patens shows a similar proportion (27.2 96, 15,090 unigenes) of sequences 
matching the genome of Coffea canephora, which was published recently (Denoeud 
et al., 2014; Dereeper et al., 2015). Further comparative analysis between H. patens 
and C. canephora is presented in section 2.4 below. 


To understand the potential function of the assembled unigenes, the COGs 
(clusters of orthologous group of proteins) functional annotation system was used. In 
total, 18,948 (21.1 96) unigenes were mapped to the COG database and possible 
functions and statistics were predicted. The five largest categories are general function 
(17.9 96), transcription (8.9 96), replication, recombination and repair (8.6 96), signal 
transduction mechanisms (7.0 96), and posttranslational modification, protein turnover 
and chaperones (6.9 %) (Supplementary Fig. 52). The proportion of genes annotated 
by COG and the predicted categories are quite similar in other studies of non-model 
species without known genomic sequences in the past a few years; for instance, 20.4 % 
in Triadica sebifera in Euphorbiaceae (Divi et al., 2016), 20.1 % in Uncaria 
rhynchophylla in Rubiaceae (Guo et al., 2014) and 21.196 in Dendrocalamus latiflorus 
in Gramineae (Zhang et al., 2012). 


GO assignments were used to classify the functions of predicted genes, resulting 
in 42,244 (47.0 96) unigenes assigned to at least one GO term, while 47,605 (53.0 96) 
could not be assigned. Among the 42,244 BLASTable (E-value « le-5) unigenes, 
17,923 unigenes are distinct singletons, while 24,321 are distinct cluster unigenes, 
distributed in 9,639 clusters. If the unigenes contained in the distinct clusters were 
treated as single unigenes, then the total number of annotated unigenes would be 
27,562, and the annotated proportion would be 41.0 %, which is close to the 
proportion of unigenes in the same cluster treated as independent unigenes. 
Non-BLASTable (E-value > 1e-5) sequences have been reported in all studied plant 
transcriptomes, with the proportion varying from 13.0 % to 80.0 %, depending on the 
species, the sequencing depth and the parameters of the BLAST search (Parchman et 
al., 2010; Blanca et al., 2011; Ness et al., 2011; Zhang et al., 2012; Zhang FJ. et al., 
2014). Non-BLASTable (E-value > 1e-5) sequences might result from biological 
factors, including rapidly evolved genes with divergent sequences, species-specific 
genes and the persistence of non-coding fractions derived mainly from untranslated 
regions (Zhang et al., 2012; Logacheva et al., 2011). The BLASTable (E-value < 
1е-5) unigenes are divided into three GO categories: biological process (32,700), 
cellular component (33,751), and molecular function (31,293). In the biological 
process category, the most abundant sequences are classified into cellular process 
(81.0 95), metabolic process (77.0 96) and single-organism process (58.4 96). In the 
cellular component category, cell (93.6 96), cell part (93.6 96), and organelle (74.6 96) 
are the three most represented GO terms. In the molecular function category, catalytic 
activity (66.7 96), binding (63.6 96), and transporter activity (10.0 96) are highly 
represented (Supplementary Fig. S3). The constitution of the most represented GO 
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terms shows a high degree of similarity with the GO annotation of C. canephora 
(Dereeper et al., 2015). 


To identify biological pathways activated during anther development of H. 
patens, the assembled unigenes were annotated against the KEGG database (E-value 
< 1е-5). A total of 31,169 (34.7 96) unigenes were mapped into 128 pathways in the 
KEGG database (Supplementary Table S2). The maps with the highest unigene 
representation are metabolic pathways (6,975, 22.4 96), followed by biosynthesis of 
secondary metabolites (3,465, 11.1 96), plant-pathogen interaction (1,796, 5.8 %) and 
plant hormone signal transduction (1,609, 5.2 96). This constitution exhibits a high 
degree of similarity with rice anther transcriptome profiles derived from microarray 
analysis (Deveshwar et al., 2011). 


2.4 Comparison with the coffee genome 

Alignments of the H. patens unigenes (89,849) with the coffee genome (25,574 
CDSs) revealed 15,090 (16.8 %) unigenes with a significant overlap to 7,313 (28.6 96) 
coffee CDSs. This proportion is lower than expected given that both species are from 
the same family. When clean reads (105 million) were mapped to the coffee genome 
allowing a maximum 2 bp mismatch, a total of 366,227 (0.35 96) reads matched 4,047 
(15.8 96) coffee CDSs, a lower proportion than observed for unigene alignments. 
Meanwhile, the number of reads with no mismatches was 8,821, corresponding to 286 
coffee CDSs, and 79,220 reads with a 1 bp mismatch corresponding to 1,549 coffee 
CDSs. It is predictable that as mismatched bases increase, more matched reads and 
coffee CDS will result. Read alignment is presumed to be more accurate, since the 
reads represent the most comprehensive data. The low proportion of matches between 
H. patens and C. canephora might be explained by the fact that these species are not 
genetically close and are members of distant tribes in different subfamilies. 


Among the 7,313 coffee CDSs, 3,774 have just one matching unigene, while the 
remaining 3,539 match multiple unigenes. In these cases, most of those matching 
unigenes are from the same cluster and are longer than the matched coffee CDS. This 
is indicative of sequence similarities other than matches with different portions of the 
same coffee CDS. Among these aligned unigenes of H. patens, 80 (0.5 96) are related 
to anther wall and pollen wall development. The remaining unigenes are largely 
annotated with response to salt stress, DNA-dependent regulation of transcription, 
oxidation-reduction process, response to cadmium and protein phosphorylation, which 
represent general functions (Supplementary Table S3). 


2.5 Digital Gene Expression Tag Profiling (DGE) 

The numbers of unigenes mapped by clean reads in each of the four libraries are 
70,735 (MMC), 71,510 (TET), 71,248 (UNM) and 70,288 (BCP). Among these, 
29,002 (41.0 %), 28,660 (40.8 96), 26,806 (37.6 96) and 26,920 (38.3 96) genes have 
RPKM values > 5 in MMC, TET, UNM and ВСР, respectively. Detecting low 


13 


abundance transcripts is one of the advantages of RNA-seq. In this study, 14,059 
(19.9 96), 13,731 (19.2 96), 15,399 (21.6 %) and 15,617 (22.2 %) genes are low 
abundance in MMC, TET, UNM and BCP respectively with RPKM < 1. Genes 
with RPKM = 10 at one or more stages were considered highly expressed and the 
numbers detected at MMC, TET, UNM and ВСР stages were 18,593 (26.3 %), 18,129 
(25.4 96), 16,995 (23.9 %) and 17,370 (24.7 96) respectively (Fig. 4). It is supposed 
that unigenes with higher expression are more likely to be full-length, therefore 
further analyses was focused on genes with RPKM values = 10 at one or more 
stages. These are classified into four main categories and 15 sub-categories based on 
expression pattern. The four main categories are highly expressed (RPKM value — 
10) in one, two, three or in all four stages. The numbers of genes of each sub-category 
are calculated and illustrated in the Venn diagram (Fig. 5). In total, 11,570 genes are 
highly expressed across all stages, which may suggest their involvement in 
housekeeping functions or general metabolism. 


2.5.1 Identification of up and down-regulated genes and their GO annotations 

The comparisons of adjacent stages, identified 2,851, 2,491 and 4,781 genes that 
were potentially up-regulated by at least two-fold, while 3,443, 3,591 and 4,305 genes 
were potentially down-regulated in TET, UNM and BCP stage, respectively. A greater 
number of genes were down-regulated in comparison to those up-regulated in TET 
and UNM, however, this trend is reversed in BCP where a larger proportion of genes 
showed up-regulation (Fig. 6). Among these putatively up-regulated genes, those with 
high expression (RPKM value = 10) in one particular stage and lower expression 
(RPKM < 10) in the other three stages, were considered stage-enriched. 927 (32.5 %), 
454 (18.2 %) and 2,068 (43.3 96) genes are enriched in TET, UNM and ВСР, 
respectively (Fig. 6). BCP has the largest proportion of stage-enriched genes in all the 
stages analyzed, moreover, both up and down-regulated genes increase dramatically 
in BCP in comparison to earlier stages, highlighting the distinct gene expression 
profile at this stage. 


Functional association of the potentially up-regulated and stage-enriched genes 
based on Gene Ontology (GO) annotations (Supplementary Table S4) revealed that 
TET stage-enriched and up-regulated genes were enriched in protein phosphorylation, 
metabolic process, oxidation-reduction process, histone H3-K9 methylation, DNA 
methylation, DNA-dependent regulation of transcription. UNM stage-enriched and 
up-regulated genes were enriched in oxidation-reduction process, DNA-dependent 
regulation of transcription, proteolysis, metabolic process, transmembrane transport 
and response to cadmium ion. Categorization of BCP stage-enriched and up-regulated 
genes шю GO functional groups showed enrichment for genes involved in 
oxidation-reduction process, pollen tube growth, plant-type cell wall modification, 
protein phosphorylation and transmembrane transport, which could be important 
contributors to pollen maturation and the pollen transcriptome. However, a large 
number (38,440) of BCP expressed genes have not been previously reported, which 
could serve as a useful resource to mine transcripts for validation of putative 
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gametophyte and/or germline functions. 


Genes with GO terms (Supplementary Table S5) related to anther wall and pollen 
wall development were further selected (Fig. S4). Among the putatively up-regulated 
genes in TET stage, 40 (46.5 %) are related to pollen exine development, 40 (46.5 96) 
are related to other components of, or contributing to, pollen wall development and 6 
(7.0 96) are related to tapetum development. Among the stage-enriched genes in TET 
stage, 6 (35.3 96) are related to pollen exine development, 9 (52.9 %) to pollen wall 
development and 2 (11.8 96) to tapetum development. At the UNM stage, 11 (29.0 96) 
putatively up-regulated genes are related to pollen exine development and 27 (71.0 96) 
are related to other components of, or contributing to, pollen wall development, but 
none to tapetum development. In addition, there is 1 stage-enriched gene 
(Unigene6816) related to pollen exine development and 1 (Unigene9576) contributing 
to hemicellulose metabolic process, but none contributing to tapetum development. At 
the BCP stage, 50 (21.4 96) putatively up-regulated genes are related to pollen exine 
development, 180 (76.9 95) to other components of or contributing to pollen wall 
development and four (1.7 96) to tapetum development. At BCP stage, 20 (24.1 96) 
stage-enriched genes are related to pollen exine development, 63 (75.9 %) to other 
components of or contributing to pollen wall development and none related to 
tapetum development. From TET stage onwards, the genes related to pollen wall 
development first decrease in UNM then increase in BCP, while the genes related to 
tapetum development decrease in both UNM and BCP. This trend is consistent with 
developmental events in pollen wall and tapetum. 


2.5.2 Identification of putatively stage-specific expressed genes and their 
expression profiles 

Pairwise comparisons within the four DGE libraries identified 7,720 genes are 
differentially expressed across the four stages. These genes were categorized into 14 
groups based on up or down regulation. Hierarchical cluster analysis was performed 
on each of the groups to illustrate the results (Fig. 7). 

In group 1 to 4, the genes are at least two-fold up-regulated in one particular stage 
comparing to any of the other three stages and the differences in expression level are 
not significant. In total, 1,644 (21.3 96), 1,637 (21.2 96), 688 (8.9 %) and 3,220 
(41.7 96) genes illustrated in the groups 1 to 4 show expression peaks in MMC, TET, 
UNM and ВСР, respectively. Gene Ontology (GO) annotation analysis of the 
differentially expressed genes (Supplementary Table 56) indicated that a large number 
of genes were involved in oxidation-reduction processes, protein phosphorylation, 
pollen tube growth, response to salt stress and response to cadmium ion. A total 
number of 86 genes (1.1 96) in the groups 5 and 10 show two-fold up-regulation in the 
former or the latter two stages, coinciding with the time of tapetum formation or 
tapetum PCD. These genes participate in oxidation-reduction processes, metabolic 
processes and DNA-dependent regulation of transcription. Genes in group 7 are 
down-regulated at tetrad and uninucleate microspore stages, while genes in group 8 
are two-fold up-regulated at these two stages. The genes in both groups were mainly 
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enriched for protein ubiquitination. Genes in group 6 are down-regulated at tetrad and 
bicellular pollen stages, whereas group 9 genes are two-fold up-regulated at these 
stages. Group 9 genes are involved in metabolic processes, sodium ion 
transmembrane transport, regulation of pH, pollen tube growth and plant-type cell 
wall modification. In groups 11 to 14, a total of 427 (5.5 %) genes are two-fold 
down-regulated in one particular stage. The genes in these 4 groups were mainly 
related to  oxidation-reduction process, ATP catabolic process, translation, 
DNA-dependent regulation of transcription. In total 3,427 (44.4 96) genes in all 14 
groups were not annotated by GO terms and as such, deserve further attention as a 
source of genes with unidentified roles in anthers. 


Groups 1 to 4 were further analyzed to identify stage specifically enriched genes. 
They were defined as two-fold up-regulated and highly expressed (RPKM = 10) in 
one particular stage, while expression values in the other three stages were less than 
10. As a result, 3,769 stage-specifically enriched genes across the four stages were 
obtained. The numbers at each stage were, 776 (20.6 96), 807 (21.4 96), 322 (8.5 96) 
and 1,864 (49.5 %) at MMC, TET, UNM and ВСР respectively (Fig. 8). UNM had 
the lowest share of stage specifically enriched genes, while BCP had the largest share. 
Thus, BCP showed the most diverse expression profiles compared to the other three 
stages. 


The expression patterns of all stage specifically expressed genes were analyzed 
and genes with similar expression profiles were clustered. A total of 14 clusters were 
generated (Fig. 9). Genes in the same cluster are considered co-expressed and may be 
targets of the same transcription factors. The expression profile of the largest cluster 
(“а”; 1,487 genes) showed up-regulation in MMC and TET followed by 
down-regulation in UNM and BCP, coinciding with the pattern of tapetum 
development. Further, GO terms relating to tapetum development are strongly 
enriched in cluster “а” (Supplementary Table S7) confirming the cytological changes 
of tapetum development from MMC to BCP involve gene down-regulation. Clusters 
“b” (242) and “о” (132) also exhibit down-regulation from MMC to BCP and are 
dominated by genes related to oxidation-reduction process, DNA-dependent 
regulation of transcription and metabolic process. Clusters “]” (246), “m” (237) and 
“n” (375), show up-regulation trends from MMC to BCP and are dominated by genes 
related to oxidation-reduction process, pollen tube growth, plant-type cell wall 
modification and cellular membrane fusion. In clusters “а” (102), “с” (95) and “Р” 
(90), genes were up-regulated at both or either of the TET and UNM stages and 
annotated by lipid metabolic process, very long-chain fatty acid metabolic process, 
transmembrane transport and intracellular signal transduction. These features coincide 
with the finding that tapetal cells are most active during TET and UNM stages and are 
known to be involved in the synthesis of flavonoids and other secondary metabolites 
that eventually are transported to developing microspores. In clusters “1” (135), 41” 
(199) and “h” (332), genes are down-regulated at both or either of the TET and UNM 
stages and annotated by pollen tube growth, oxidation-reduction process, plant-type 
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cell wall modification and negative regulation of programmed cell death. Genes in 
cluster *k" (66), showing up-regulation at TET and BCP stages, are mainly involved 
in xylan biosynthetic process, glucuronoxylan metabolic process, plant-type cell wall 
modification, cellulose biosynthetic process and plant-type cell wall biogenesis. 
Coincidently, callose wall, primexine and intine related to polysaccharide metabolism 
are synthesized mainly at TET and BCP stages. Genes down-regulated at TET and 
ВСР stages are clustered in “е” (31), and are enriched in ATP catabolic process, 
metabolic process, transmembrane transport and DNA-dependent regulation of 
transcription. 


2.5.3 Candidate genes relevant to tapetum and pollen wall development 
Differentially expressed genes and specifically expressed genes enriched for the 
relevant GO terms of anther wall and pollen wall development were identified. In 
total 243 differentially expressed genes and 108 stage-specific genes were obtained 
(Fig. 55). Genes related to sporopollenin biosynthetic processes, are only enriched at 
TET (6) and UNM (1) stages. Genes related to pollen exine formation, are mainly 
enriched at TET (26 genes) and BCP (35 genes) stages, among which 5 (19.2 96) and 
19 (54.3 %) genes are specifically expressed. Only 1 gene related to tapetal layer 
morphogenesis is specifically expressed at TET stage. There are 4 and 3 genes related 
to tapetal layer development differentially expressed at MMC and TET stages, 
respectively. Among those genes, 3 are specifically expressed at MMC and 1 is 
specifically expressed at TET stage. Genes relating to pectin are mainly enriched at 
MMC and ВСР stages, with 14 (MMC) and 26 (ВСР) genes being differentially 
expressed and 11 of which are specifically expressed at BCP stage. Genes relating to 
cellulose mainly express at MMC, TET and BCP stages, with the numbers of 
differentially expressed genes being 15, 12 and 41, and the numbers of specifically 
expressed genes as 5, 4 and 20 respectively. Genes corresponding to hemicellulose 
metabolic processes are only expressed at TET, UNM and BCP stages, whilst 88.4 % 
(38 genes) of these are differentially expressed at BCP stage and among these genes, 
30 are specifically expressed. These findings are consistent with the cytological 
changes and developmental events in anther wall and pollen wall development. 


Pollen wall development requires lipid and polysaccharide metabolism, therefore 
genes enriched in carbohydrate and lipid metabolism pathways were analyzed (Fig. 
S6). The numbers of differentially expressed genes in both pathways only slightly 
change between MMC and UNM stages, while a marked increase occurs at BCP stage. 
This trend emphasizes the distinct gene expression profile at BCP stage. At MMC and 
UNM stages, a greater number of stage-enriched genes are involved in lipid 
metabolism in comparison to those involved in carbohydrate pathways 
(Supplementary Table S8). At TET and BCP stages, this trend is reversed, where 
fewer stage-enriched genes are involved in lipid metabolism than in carbohydrate 
metabolism. The MMC stage is enriched with a greater number of genes relating to 
glycerophospholipid and lipid metabolism, while TET stage is enriched in starch and 
sucrose metabolism as well as glycerophospholipid metabolism. UNM is enriched 
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with the greatest number of genes relating to cutin, suberine and wax biosynthesis. 
The first three largest numbers of genes at BCP are related to starch and sucrose 
metabolism, pentose and glucuronate interconversions, and glycerophospholipid 
metabolism. This might suggest nutrient synthesis and storage at BCP stage, 
consistent with the observation that the bicellular pollen cytoplasm is characterized by 
an abundance of starch grains and lipid droplets. 


2.6 Verification of expression profiles of selected unigenes related to tapetum and 
pollen wall development by qRT-PCR 

Real-time quantitative PCR (qRT-PCR) analysis was carried out on 20 candidate 
genes selected at random (Fig. 10). Among these, 14 genes show high expression with 
RPKM 2 10 in at least one stage, while 6 display low expression with ЕРКМ < 
10 at all four stages. The expression values at developmental stages derived from 
RNA-seq and q-PCR of each gene are listed in Supplementary Table S9. The 
expression patterns of all 20 genes analyzed by qRT-PCR largely agree with the 
digital gene expression tag profiling (DGE), as the correlation co-efficients (r) are all 
greater than 0.9. This result provides support for the reliability of RNA-Seq data and 
the differential gene expression profiles observed for anther stages of H. patens. 


3. Discussion 
3.1 Tapetum and pollen development display common features in Rubiaceae 
Anther development is a complex process which involves transition. from 
sporophyte to gametophyte, control of mitotic and meiotic cell divisions, together 
with the coordination of pollen and anther maturation (McCormick, 1993; Scott et al., 
2004; Zhang et al., 2011). Ultrastructural changes involved in anther development 
have been described for many angiosperms, especially for model species including 
Arabidopsis and rice (Sanders et al., 1999; Zhang and Wilson, 2009; Zhang et al., 
2011), although there are differences in the resolution and detail of the stages 
described (Owen and Makaroff, 1995). In our study, anther and pollen development of 
H. patens was shown to be typical of many angiosperms, with some features common 
in the Rubiaceae family. Our observations of developmental events and cytological 
changes provides a solid foundation for the analysis of gene expression profiles in the 
developing anthers of H. patens. 


The tapetum cells in H. patens maintain integrity and position indicating that the 
tapetum is of the secretory type, which fits the description of a type five tapetum 
(Pacini, 1997), the most common in angiosperms (Owen and Makaroff, 1995). Ubisch 
bodies occurred simultaneously with the developing pollen exine, which corroborates 
the idea that they are required for transferring tapetum-derived sporopollenin 
precursors to the exine (Huysmans et al., 1998). The tapeta of Arabidopsis and rice 
are also secretory but use different export routes for sporopollenin. In Arabidopsis, 
specialized organelles, elaioplasts and tapetosomes, are supposed to be transporters. 
Lipids are transported from the tapetum to microspores in two ways: vesicular 
transport and use of transporters as lipid carriers (Vizcay-Barrena and Wilson, 2006). 
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In rice and other cereals, Ubisch bodies are thought to export sporopollenin across the 
hydrophilic cell wall to the locule (Zhang et al., 2011). Thus, the tapetum in dicots 
and monocots might have different mechanisms for sporopollenin translocation. 
Analogies are often found between the ornamentation of the pollen exine and that of 
the Ubisch body wall (Hesse, 1986; Vinckier and Smets, 2002, 2003). Ubisch bodies 
are considered have great potential as model system to study sporopollenin deposition, 
since they are acellular structures, independent of cytoplasmic control (Clément and 
Audran, 1993; Verstraete et al., 2014). To date, only the RAFTIN gene was identified 
in pro-orbicule bodies and shown to accumulate in Ubisch bodies in wheat and rice. 
RAFTIN is highly anther-specific and essential for pollen development in cereals 
(Wang et al., 2003). Future genetic studies to identify the genes and gene products that 
control sporopollenin polymerisation and exine patterning will need to take Ubisch 
bodies into account when screening for phenotypes (Verstraete et al., 2014). 


The fundamental structure of pollen wall in angiosperms consisting of the outer 
exine and the inner intine is generally conserved (Heslop-Harrison, 1968; Blackmore 
et al., 2007; Ariizumi and Toriyama, 2011). The components contributing to the pollen 
wall are produced and accumulated in a precise temporal sequence. Sporopollenin 
makes up the majority of the material of the exine (Blackmore et al., 2007). In H. 
patens, the sporopollenin deposition starts from TET stage. Intine development starts 
from the late UNM stage and is complete by the end of BCP stage. The ontogenetic 
sequence of pollen wall development follows the basic scheme in the family (Hansson 
and El-Ghazaly, 2000; El-Ghazaly et al., 2001). Intinous projections occurred in the 
aperture region of H. patens pollen grains, known as protruding onci or pollen buds in 
the Rubiaceae, was believed to be a relatively common feature in this family (Dessein 
et al., 2005; Kuang et al., 2012). 


Besides exine and intine, the callose wall and primexine are successively 
synthesized and degraded at precise times. The callose wall, consisting of linear 
p-1,3-glucan polymers (Hird et al., 1993), begins to deposit at the beginning of TET 
stage and degrades at the end of TET stage in H. patens. The primexine, which acts as 
a template that guides the accumulation of sporopollenin, is composed of neutral and 
acidic polysaccharides, cellulose and some proteins (Heslop-Harrison, 1968; Scott et 
al., 2004; Jiang et al., 2013). Primexine matrix usually with fibrillary texture appears 
between the callose and the plasmalemma at the TET stage in H. patens. In addition, 
lipidic material, which may be classed as pollenkitt (El-Ghazaly et al., 2001; Edlund 
et al., 2004), is deposited onto the exine surface at the late ВСР stage. These 
ontogenetical features were also described in the available ultrastructural studies in 
Rubiaceae (Hansson and El-Ghazaly, 2000; El-Ghazaly et al., 2001; Kuang and Liao, 
2015). 


3.2 DEGs expression patterns agree with the developmental events 


Genes related to sporopollenin biosynthetic process were closely associated 
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with fatty acid metabolism and only expressed at TET and UNM stages, consistent 
with the appearance of pro-Ubisch bodies and Ubisch bodies in tapetal cells. Genes 
related to pollen exine formation are enriched at TET and BCP stages, which may 
correspond to the time that primexine formed and exine deposited. Genes related to 
hemicellulose metabolic process, are mainly enriched at BCP stage, a few are 
enriched at TET and UNM stages and none are expressed at MMC stage, coinciding 
with the stages at which the primary cell wall and intine develop. Genes related to 
pectin are largely enriched at BCP stage, and genes related to cellulose are largely 
enriched at TET and BCP stages. These gene expression patterns agree well with the 
observed stages in which intine and primexine appear in H. patens. Genes related to 
tapetal layer morphogenesis and development are enriched at MMC and TET stages, 
consistent with stages during which tapetal cells are differentiated and pro-Ubisch 
bodies are produced. 


The numbers of genes enriched in different pathways are correlated with 
structures and components present at different anther developmental stages. At UNM 
stage, abundant Ubisch bodies are synthesized, which involve lipid metabolism. 
Consistently, more genes are revealed in lipid metabolism pathways than in 
carbohydrate metabolism pathways at the same stage. The synthesis and degradation 
of callose wall and primexine, and the formation of intine are involved in 
carbohydrate metabolic pathways (Jiang et al., 2013). The genes related to the 
carbohydrate metabolism are enriched at the stages during which callose wall, 
primexine and intine form. In addition, the cytoplasm of mature H. patens pollen 
grains is characterized by an abundance of compound starch grains and lipid droplets 
at BCP stage. This may account for the increase in the number of genes enriched in 
both carbohydrate and lipid metabolism pathways at this stage. 


GO enrichment analysis of the stage-enriched genes and differentially expressed 
genes suggested the gene number, gene function and expression profile were 
significantly correlated with anther developmental processes in H. patens. The genes 
filtered from KEGG pathways analysis and GO enrichment analysis, provide a useful 
resource to identify novel genes and their possible functions in pollen wall 
development. 


3.3 Comparative analysis of anther and pollen transcriptomes confirms 
inter-specific conserved patterns 

Developmental transcriptomic analysis of anthers and pollen grains utilizing 
microarray platforms have been reported in Arabidopsis (Honys and Twell, 2004; Pina 
et al., 2005), rice (Huang et al., 2009; Fujita et al., 2010; Deveshwar et al., 2011), 
maize (Ma et al., 2008), and Brassica rapa (Dong et al., 2013). Our study revealed 
89,849 unigenes in the H. patens anther transcriptome, well above the numbers of 
genes predicted to be expressed in anthers of other species. However, high-throughput 
sequencing techniques increase the chance of detecting transcripts with low 
abundance and microarrays have been reported to be less effective at identifying low 
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abundance transcripts (Loraine et al., 2013). Further, the number of H. patens 
unigenes is an estimate, since the genome is unknown and unigenes sharing sequence 
similarities were classified as different unigenes, in consideration of alternative 
splicing events. High numbers of unigenes have also been reported for species with no 
reference genomes such as for the fruit transcriptome of Triadica sebifera with 92,550 
unigenes (Divi et al. 2016), the embryo transcriptome of chrysanthemum with 
116,697 unigenes (Zhang FJ. et al., 2014) and the leaf transcriptome of Lycium 
chinense with 61,595 unigenes (Wang et al. 2015). Finally, each developmental stage 
of H. patens anther contains about 20 % of transcripts with RPKM = 1, which most 
likely would not be detected with microarrays. These low abundance transcripts 
provide a resource for future analyses. 


Transcriptomic analysis in developing and germinated pollen of rice displayed a 
“U-type” change in the number with the lowest number at the bicellular pollen stage 
(Wei et al., 2010). Developmental analysis of the Arabidopsis pollen transcriptome 
revealed a major shift in mRNA populations between BCP and TCP stages, reflective 
of the transition from earlier cell division to later pollen maturity (Honys and Twell, 
2004). This distinct phase shift suggests inter-specific conservation of 
pollen-expressed genes. In H. patens anthers, BCP stage has the greatest number of 
stage-specific genes. The divergence between H. patens and the two model species 
might result from the relative contribution of sporophytic and gametophytic cell types 
and earlier developmental stages (MMC and TET) in H. patens together with 
taxon-specific features, such as insect-pollination and bicellular pollen at maturity. 
Moreover, different platforms, tissue collections and comparison methods can result 
in significant variation in anther transcriptome profiles for different species 
(Hollender et al., 2014). 

Since sample staging of anther development is similar between H. patens and 
rice in the study of Deveshwar et al (2011), additional comparative analysis was 
carried out. Transcriptome profiling of rice was investigated in anthers at pre-meiotic 
(PMA), meiotic (MA), single-celled microspore (SCP) and tri-nucleate pollen (TPA) 
stages (Deveshwar et al., 2011). At PMA and MA stages (MMC in H. patens), there 
was high representation of sporophytic tissue compared to gametophytic tissue. This 
characteristic coincided with the phenomenon that most of the transcriptome changes 
corresponding to the sporogenous tissue and developing tapetum. TPA anthers 
contained a relatively higher cellular mass of gametophytic tissue (pollen), meanwhile 
prominent differences were found in transcript composition comparing with 
transcriptomes of earlier anther stages. In H. patens, BCP stage anthers were the most 
distinctive of the four stages, with the highest number of stage-specific and 
differentially-expressed transcripts (Fig. 8), showing great similarity to the TPA stage 
in rice (Deveshwar et al., 2011). This common shift, from free microspore-stage 
anthers to late stage anthers, could be explained by the distinctive transcriptomes of 
developing pollen and sperm. It also reflects the large number of BCP enriched genes 
in H. patens associated with pollen development and maturation. Comparison of gene 
expression between two adjacent stages of anther development further highlighted this 
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major switch of gene expression. The apparent transition from the free microspore 
stage to the BCP stage in H. patens also reflected the switch from the sporophytic to 
the gametophytic programme (Deveshwar et al., 2011). 


3.4 Homologs of anther and pollen-expressed transcription factors were revealed 

Putative homologs of known transcription factors expressed in developing 
anthers were identified in H. patens (Supplementary Table S10). The expression 
patterns of these genes were largely consistent with those of their counterparts in 
other species, supporting the conservation of regulatory features in anther and pollen 
development in H. patens. 

In the early stages of Arabidopsis anther development, DYTI is required for 
tapetum development and function (Zhang et al., 2006), directly regulating TDF/ 
expression in the tapetum (Feng et al., 2012). The highest expression of both DYTI 
and ТРЕ! occurred at similar stages of tapetal development (Gu et al. 2014). 
CL1502.contig] of Hamelia patens shares high amino acid sequence identity with 
DYTI in the ӘНІН domain. Unigene29472 is a putative homolog of TDFI. Both 
CL1502.contig] and Unigene29472 show expression peaks at MMC stage, weak 
expression at TET or/and UNM stages, but no expression at BCP stage. AtMYB103, 
another important transcription factor for tapetum function and male fertility, is also 
regulated by БҮТІ (Feng et al, 2012). A candidate homolog of А/МҮВ103, 
CL6214.contigl, shows peak expression at MMC stage and lower expression at TET 
stage, but no expression at UNM and BCP stages. The expression pattern of these 
three putative homologs correspond to the early stages of tapetum development and 
callose dissolution, and exhibit somewhat consistent patterns with those of DYTI, 
AtMYB103 and TDF1, which suggests shared pathways in the regulatory network 
controlled by DYT1 in Arabidopsis and H. patens. 

CL4992.contig3 shares high amino acid sequence identity with the MYB 
domains of А/МҮВ97, AtMYB101 and AtMYB120. It shows expression peak at BCP 
stage, indicating a consistent profile with these three MYB genes. DUO POLLENI 
(DUOI) is an Arabidopsis male germline-specific R2R3-type MYB transcription 
factor (Rotman et al., 2005; Durbarry et al., 2005) that is essential for male germ cell 
division and differentiation (Brownfield et al. 2009). CL1108.Contig1, homologous to 
AtDUOI, shows peak expression at BCP stage, consistent with that of AtDUO1. It is 
likely that some of the pathways regulated by MYB factors are shared by Arabidopsis 
and H. patens. 

The AtMIKC* transcription factors (TFs) AGAMOUS-LIKE30 (AGL30), 
AGL65, AGL66, AGL94 and AGL104 are mainly expressed in mature pollen and 
play an essential role in transcriptional regulation during late pollen development 
(Verelst et al., 2007a; Verelst et al. 2007b). In H. patens, Unigene32598 and 
Unigenel629 share high amino acid sequence identity in the MIKC* domain, 
representing probable homologs of MIKC* transcription factors. Both of these are 
predominantly expressed at BCP stage, consistent with the expression patterns of 
AtMIKC* genes, which indicates conserved regulatory features of late pollen 
development and pollen maturation in H. patens. 
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3.5 Expression profiles of homologs with verified conserved patterns in pollen 
development 

The H. patens homologs of known genes with characterized functions in exine 
formation in Arabidopsis or rice were identified (Supplementary Table S10). Their 
expression profiles were analysed and verified by Q-PCR. TKPRI, which is involved 
in a conserved biosynthetic pathway in sporopollenin monomer biosynthesis, shows 
peak expression at tetrad stage (Grienenberger et al. 2010). Unigene32349, a 
homolog of TKPRI, is putatively highly expressed at TET stage, consistent with the 
expression profile of TKPRI. LAP6 and LAPS are specifically expressed during the 
period of exine synthesis and are essential for the exine production (Dobritsa et al., 
2010). The homologous unigenes CL12695Contigl апа CL3181Contigl show peak 
expression at TET and UNM stages. LAP3 is essential for pollen development and 
proper exine formation (Dobritsa et al., 2009), and the homologous Unigene27400 
shows peak expression at TET stage. Unigene34182 shows peak expression at TET 
and UNM stages and represents the H. patens homolog of MS2, which is involved in 
exine development and is expressed in tapetum shortly after microspore release (Aarts 
et al., 1997, Gómez et al. 2015). Unigene33351 and CL9729-Contig3 show higher 
read counts at TET stage and are homologous to the gene ABCG26 / WBC27. 
АВСС26 is expressed specifically in tapetal cells at the early vacuolate stage and 
plays a crucial role in the transfer of sporopollenin lipid precursors from tapetal cells 
to anther locules (Choi et al., 2011), while WBC27 is expressed during early stages of 
anther development (Xu et al., 2010). 


Unigenes homologous to known genes related to tapetum development, intine 
development and callose synthesis were also identified in H. patens (Supplementary 
Table S10). Unigene30846, homologous to AtUSP (Schnurr et al., 2006), is suggested 
to be involved in intine development and putatively highly expressed at BCP stage, 
consistent with the period of intine accumulation. Moreover, Unigene30846 matches 
Сс10 g13640 of coffee and shares the functional GO annotation of UDP-sugar 
pyrophosphorylase, which suggests functional similarity. Unigene29363 is 
homologous to CalS5, which encodes a callose synthase (Dong et al., 2005). The 
annotation of Unigene29363 (0052543 // callose deposition in cell wall, 0006075 // 
(1->3)-beta-D-glucan biosynthetic process and 0009556 // microsporogenesis) and 
peak expression at TET stage are consistent with the function and expression pattern 
of CalS5. RTS2, a unique gene in the rice genome, is required for tapetal development 
and is predominantly expressed during meiosis (Luo et al., 2006). Unigene8854 is 
homologous to RTS2, and shows peak expression at TET stage corresponding to that 
of RTS2. 

Collectively, the predicted functions and verified expression patterns of the 
aforementioned unigenes are consistent with those of their homologues in Arabidopsis 
and/or in rice. This phylogenetic conservation of gene expression further validates our 
analysis of the H. patens anther transcriptome. 
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3.6 Potential genes for tapetum and pollen wall development provide useful 
resources for future study 

Currently most characterized genes involved in tapetum or pollen wall 
development were identified based on genetic analysis of male sterile or reduced 
fertility mutants, including those with defective pollen wall development. 
Transcriptomic profiling is expected to expand knowledge of the genes involved in 
these developmental processes. In particular, our analysis of the H. patens anther 
transcriptome has allowed the identification of many more candidate genes involved 
in pollen development. We identified 243 differentially expressed genes and 108 
stage-specific genes potentially related to tapetum layer morphogenesis and 
development, sporopollenin biosynthesis, exine formation, cellulose and pectin 
metabolism and biosynthesis, and hemicellulose and cellulose metabolism. The most 
significant alignments among these classes of genes derive from Solanum 
lycopersicon, Vitis vinifera and Coffea canephora. Moreover, some potential 
orthologues in C. canephora have similar annotations relevant to pollen development, 
for instance, callose synthase and cellulose synthase. Low-abundance transcripts 
expressed during anther development were also mined and differential expression 
patterns of genes in anthers and pollen were uncovered. This wealth of information 
lays the foundation for higher resolution genome-wide transcriptomic profiling of H. 
patens, functional investigation of the identified candidate genes, and the evo-devo 
exploration of angiosperm pollen. 
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Tables 
Table 1 Classification of flowers and anthers for categorization of anther 
developmental stages. 


Pollen developmental stage Flower length (cm) Anther length (cm) 


Microspore mother cell stage 0.30 - 0.80 0.20 - 0.45 
Tetrad stage 0.90 - 1.00 0.50 - 0.60 
Uninucleate microspore stage 1.10 - 1.60 0.60 - 0.85 


Bicellular pollen stage 1.80 - 2.70 0.91 - 1.01 
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Table 2 Annotation Results of H. patens Unigenes 
Sequence File NR NT Swiss-Prot KEGG COG GO ALL 


55,563 46,601 35,244 31,169 18,948 42,244 57,476 


H.patens-Unigene 618%) (519%) (392%) (347%) (21.1%) (47.0%) (640%) 


Figure legends 

Figure 1 Semi-thin sections of H. patens anthers. 

(A) Microspore mother cell stage, showing the cell layers of the anther wall and the 
microspore mother cells. 

(B) Tetrad stage, the tapetal cells become vacuolated, the middle layer flattened. 

(C) Uninucleate microspore stage, showing free microspores and cell layers of the 
anther wall. 

(D) Bicellular pollen stage, showing pollen grains and fibrous thickenings in the 
endothecium. 

Scale bars in A, B, C and D represent 25 um. 


Figure 2 Ultra-thin sections of H. patens anthers. 

(A) Tetrad stage, the tetrads encased by a communal callose wall. 

(B) Tetrad stage, primexine matrix forms between the callose wall and the 
microspore. 

(C) Tetrad stage, showing fragmentation of nucleoli, small vesicles and pro-Ubisch 
bodies (white arrowheads). 

(D) Uninucleate microspore stage, showing abundant Ubisch bodies produced in the 
tapetum (white arrowheads). 

(E) Showing a free microspore with differentiated wall. 

(F) Showing white-line-centered lamella (white arrowhead) between the foot layer 
and the endexine. 

(G) Vacuolated microspore stage, the nucleus is positioned against the cell wall. 

(H) Showing the pericytoplasmic space formed beneath the exine. 

(Г) Showing cell layers of the anther wall and Ubisch bodies in the tapetum (white 
arrowheads). 

Scale bars in A, C-E, G and I represent 2 um, B represents 100nm, F and H represent 
200 nm. 


Figure 3 Ultra-thin sections of H. patens anthers. 
(A) Uninucleate microspore stage, showing the early-formed intine and associated 
rough endoplasmic reticulum. 
(B) Bicellular pollen stage, showing the generative nucleus surrounded by lipid 
droplets. 
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(C) Bicellular pollen stage, showing the thickened and more electron-dense intine. 

(D) Bicellular pollen stage, showing cell layers of the anther wall, fibrous thickenings 
in the endothecium and remnants of the tapetum. 

(E) Mature pollen grains released from the anther locule. 

(F) One pollen grain, showing the micro-reticulate ornamentation and protruding 
oncus.. 

Scale bars in A represents 500 nm; B represents 2 um; C represents 200 nm; D 
represents 10 um; E represents 50 um; F represents 5 um. 


Figure 4 DGE Libraries of four developmental stages. 

The white bars in this bar graph depict the numbers of genes mapped by clean reads in 
each library. The light grey bars depict the numbers of genes with RPKM value = 1 
in each library. The dark grey bars depict the numbers of genes with RPKM value > 
5 in each library. The black bars depict the numbers of genes with RPKM value — 
10 in each library. 


Figure 5 Expression patterns of highly expressed genes. 

The Venn diagram shows the constitution of genes highly expressed in at least one 
stage. The overlaps represent genes simultaneously highly expressed at two, three or 
four stages. 


Figure 6 Comparisons of adjacent developmental stages. 

The DGE libraries of four stages were compared to their preceding stage during 
anther development. MMC has no reference. The number of genes 2-fold 
up-regulated or down-regulated at FDR < 0.001 are plotted on the red or black bars in 
the graph. Among the up-regulated genes at each stage, the numbers of specifically 
enriched genes (RPKM value = 10) are plotted on the blue bars. 


Figure 7 Pairwise comparisons within four DGE libraries. 

Genes with similar expression patterns were grouped together to make 14 groups. The 
line graphs show the variation of expression levels at four stages of anther 
development. 


Figure 8 Stage-specifically expressed genes at four developmental stages. 

The gray bars in this bar graph depict the numbers of differentially expressed genes 
(genes in group 1 to 4 in Fig. 7). The black bars depict the numbers of specifically 
expressed genes (two-fold up-regulated and RPKM value = 10) in each library. 


Figure 9 Expression patterns of stage-specifically expressed genes. 

Hierarchical cluster diagram represents expression patterns of stage-specifically 
expressed genes of four stages. The numbers of genes under each cluster are showed 
at the right side. 
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Figure 10 Result of q-PCR analysis. 

The left Y axis represents RPKM value of each gene using RNA-Seq analysis. The 
right Y axis represents log» transformed relative transcript amount obtained by Q-PCR. 
The correlation co-efficient (r) between the two expression profiles is also showed. 


36 


Fig 2 


37 


Fig4 


С Total 
RPKM 2 1 


ЕСІ RPKM=5 
70735 71510 71248 702899991 RPKM » 10 


38 


Fig 5 


Fig 6 


Number of genes 


MMC 


TET 
(18,594) ~ (18,130) 


2686 867 
ВСР UNM 
(17,370) 794 (16,996) 

mum 2 fold up-regulated 
шеш enriched at the stage 
5000 Ell 2 fold down-regulated 4781 


2851 


3443 3591 


4305 
ММС : TET TET : UNM UNM : BCP 


39 


201707.00514v1 


chinaXiv 


Fig 7 


Group 1 
(1644) 
MMC TET UNM BCP 
30000 
20000 
Group 3 
(688) 10000 
0 
MMC TET UNM BCP 
200 
150 
Group5 100 
(20) 50 
0 — 
ММС ТЕТ УММ ВСР 
50 
40 
Group 7 30 
(2) 20 
10 


50 
40 
Group 9 30 
(7) 20 
10 

0 ~ =. 


MMC ТЕТ UNM ВСР 


1000 
800 
Group 11 600 
(ото) е 
200 
0 


MMC TET UNM ВСР 
000 


1500 
Group 13 4000 
(84) 500 


o| = SS 


MMC TET UNM BCP 


Group 4 
(3220) 


MMC TET UNM BCP 


50 

40 

Group 6 30 
(1) 20 
10 

0 


MMC TET UNM BCP 


50 
40 
Group 8 30 
e 1 
10 
0 


MMC TET UNM BCP 


300 
200 
Group 10 
(66) 100 
0 


MMC TET UNM BCP 


200 

150 

Group 12 100 
(27) 50 


ММС ТЕТ UNM BCP 


2000 

1500 

Group 14 1000 
0 


MMC TET UNM BCP 


40 


201707.00514v1 


chinaXiv 


Fig 8 


Number of genes 


Fig 9 


MMC 


MMC 


TET 


TET 


UNM 


Œ Differentially expressed 
ШИ Specifically expressed 


3220 


UNM BCP 


BCP 


a (1487) 


b (242) 


d poa 
e (31) 
f (90) 

| a (182) 


| с (95) 
| 
1 
| 


h (332) 


| i (185) 


j (246) 


| k (66) 
| 1 (199) 


m (237) 


n (375) 


41 


zm 
са 

[EY 

o 


эщел сін 


J8^9| uoissedxe элцееН 


201707.00514v1 


chinaXiv 


Unigene1497 Unigene8854 Unigene15163 Unigene23525 
700 x r-0.998 20 6; r=0.968 10.10 150: r-0.971 25  & r-0.960 0.6 
600 5; 10.08 125: 2.0 
500 16 4 100: 4 0.4 
496 12 з! 006 7. 15 3 . 
pod 8 2 0.04 60. 10 2 02 
100 4 1: 1 0.02 F 0.5 1 
0 0. 
ONMMCTET UNM BCP’ © "MMCTETUNM ВСР 00 OMMCTETUNM BCP”? 2 MMO TET UNM BGP”? 
i U 28354 
180, Unger e 2.8 600 Mors ^ Bu 1 о; келесе 15 Uno 06 
150: 24 500 400 8 
120: 400 1.0 04 
| 300 6 
90: 11.4 300 же ЖЕ 
60- 200 0.5 0.2 
30: 0.7 1 09 100 2 
0 0.0 0 0. 0.0 0.0 
MMC TET UNM BCP O MMC TÉT UNM BCP MMC TET UNM BCP O MMC TET UNM BCP 
Unigene29363 Unigene30147 Unigene30723 Unigane90846 
12 r=0.981 11.0 7.5 r=1.000 1.5 100; г=0.946 16 .995 16 
10 0.8 80 ] 
5 og 50 10 во 14 14 
4 04 $5 os 40 lo 2 
2 0.2 20: | 
0 0.0 00 о 0 0 0 
MMC TET UNM ВСР O MMC TET UNM BCP MMC TET UNM BCP MMC TET UNM BCP 
Unigene32349 Unigene32989 Unigene33351 Unigene34182 
5000! 98120994 170 350) | "0997 12 200, "98080931 2 300) т.0997 4% 
| 6 300 1.0 20 
1 200 I 200 15 
| Eos 0.4 100 1 
] 50 0.2 | 5 
?MMCTETUNM ВСРо О MMIC TET ОЧМ ВСР. ° MMC TET UNM ВСР O MMC ТЕТ UNM ВСР” 
Unigene34885 CL3181-1 CL9729-3 01126951, 
1800 пепео 500 600 r-0.997 120 7 r-0.909 0.16 900 250 
1200 300 400 80 : а 150 
900 300 60 
600 200 500 Mo 3 100 
300 100 1e 20 1 = 
0 0 0 


MMC TET UNM ВСР” 


0 ммс TET UNM ВСР 


42 


MMC ТЕТ UNM BoP” 


RNA-Seq 


? MMC ТЕТ UNM BOP 
Q-PCR 


